Search results for "Conjugate prior"
showing 4 items of 4 documents
Solving two‐armed Bernoulli bandit problems using a Bayesian learning automaton
2010
Purpose: The two‐armed Bernoulli bandit (TABB) problem is a classical optimization problem where an agent sequentially pulls one of two arms attached to a gambling machine, with each pull resulting either in a reward or a penalty. The reward probabilities of each arm are unknown, and thus one must balance between exploiting existing knowledge about the arms, and obtaining new information. The purpose of this paper is to report research into a completely new family of solution schemes for the TABB problem: the Bayesian learning automaton (BLA) family. Design/methodology/approach: Although computationally intractable in many cases, Bayesian methods provide a standard for optimal decision making. B…
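The exploration/exploitation trade-off described in this abstract can be illustrated with a short sketch. The abstract is truncated before it describes the BLA itself, so the code below should not be read as the paper's scheme. It is a generic Beta-Bernoulli posterior-sampling loop, assuming a uniform Beta(1, 1) prior on each arm. The function name `run_two_armed_bandit` and its parameters are hypothetical.

```python
import random


def run_two_armed_bandit(p_arms, n_pulls, seed=0):
    """Posterior sampling on a Bernoulli bandit (illustrative sketch only).

    p_arms  : true reward probabilities, unknown to the agent (assumed values)
    n_pulls : number of sequential pulls
    Returns the final [alpha, beta] parameters per arm and the total reward.
    """
    rng = random.Random(seed)
    # Assumption: a uniform Beta(1, 1) prior on each arm's reward probability.
    params = [[1, 1] for _ in p_arms]
    total_reward = 0
    for _ in range(n_pulls):
        # Draw one sample from each arm's current Beta posterior.
        samples = [rng.betavariate(a, b) for a, b in params]
        arm = samples.index(max(samples))
        # Pull the chosen arm: reward with probability p_arms[arm], else penalty.
        if rng.random() < p_arms[arm]:
            params[arm][0] += 1  # conjugate update on a reward
            total_reward += 1
        else:
            params[arm][1] += 1  # conjugate update on a penalty
    return params, total_reward


if __name__ == "__main__":
    final_params, reward = run_two_armed_bandit([0.9, 0.1], n_pulls=2000)
    print(final_params, reward)
```

Because the Beta distribution is the conjugate prior of the Bernoulli likelihood, each update only increments one of two counters, so the per-pull cost stays constant.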
Generalized Bayesian Pursuit: A Novel Scheme for Multi-Armed Bernoulli Bandit Problems
2011
In the last decades, a myriad of approaches to the multi-armed bandit problem have appeared in several different fields. The current top performing algorithms from the field of Learning Automata reside in the Pursuit family, while UCB-Tuned and the ε-greedy class of algorithms can be seen as state-of-the-art regret minimizing algorithms. Recently, however, the Bayesian Learning Automaton (BLA) outperformed all of these, and other schemes, in a wide range of experiments. Although seemingly incompatible, in this paper we integrate the foundational learning principles motivating the design of the BLA, with the principles of the so-called Generalized Pursuit algorithm (GPST), leading to the Gen…
The design of absorbing Bayesian pursuit algorithms and the formal analyses of their ε-optimality
2016
The fundamental phenomenon that has been used to enhance the convergence speed of learning automata (LA) is that of incorporating the running maximum likelihood (ML) estimates of the action reward probabilities into the probability updating rules for selecting the actions. The frontiers of this field have been recently expanded by replacing the ML estimates with their corresponding Bayesian counterparts that incorporate the properties of the conjugate priors. These constitute the Bayesian pursuit algorithm (BPA), and the discretized Bayesian pursuit algorithm. Although these algorithms have been designed and efficiently implemented, and are, arguably, the fastest and most accurate LA report…
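The contrast this abstract draws between running ML estimates and their Bayesian counterparts can be sketched for a single action with Bernoulli rewards. This is an illustration under stated assumptions, not the estimator used by the BPA: it assumes a Beta(1, 1) prior and uses the posterior mean as the Bayesian estimate, whereas the paper may rely on a different posterior statistic. Both function names are hypothetical.

```python
def ml_estimate(rewards, pulls):
    """Running maximum likelihood estimate of a reward probability."""
    # Assumption: return 0.0 before the action has been tried at all.
    return rewards / pulls if pulls else 0.0


def beta_posterior_mean(rewards, pulls, alpha0=1.0, beta0=1.0):
    """Posterior mean under a Beta(alpha0, beta0) conjugate prior.

    After `rewards` successes in `pulls` trials the posterior is
    Beta(alpha0 + rewards, beta0 + pulls - rewards).
    """
    return (alpha0 + rewards) / (alpha0 + beta0 + pulls)


if __name__ == "__main__":
    # After a single rewarded pull the ML estimate jumps to 1.0,
    # while the Bayesian estimate stays more cautious.
    print(ml_estimate(1, 1), beta_posterior_mean(1, 1))
```

With few observations the prior keeps the Bayesian estimate away from the extremes 0 and 1. As the number of pulls grows, the two estimates converge.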
Generalized Bayesian pursuit: A novel scheme for multi-armed Bernoulli bandit problems
2011
Published version of a chapter in the book: IFIP Advances in Information and Communication Technology. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-642-23960-1_16 In the last decades, a myriad of approaches to the multi-armed bandit problem have appeared in several different fields. The current top performing algorithms from the field of Learning Automata reside in the Pursuit family, while UCB-Tuned and the ε-greedy class of algorithms can be seen as state-of-the-art regret minimizing algorithms. Recently, however, the Bayesian Learning Automaton (BLA) outperformed all of these, and other schemes, in a wide range of experiments. Although seemingly incompatible, in…